SoUnD Framework: Analyzing (So)cial Representation in (Un)structured (D)ata
Mark Díaz, Sunipa Dev, Emily Reif, Emily Denton, Vinodkumar Prabhakaran
The unstructured nature of data used in foundation model development poses a challenge to the systematic analyses that inform data use and documentation decisions. From a Responsible AI perspective, these decisions often rely upon understanding how people are represented in data. We propose a framework designed to guide analysis of human representation in unstructured data and identify downstream risks. We apply the framework in two toy examples using the Common Crawl web text corpus (C4) and LAION-400M. We also propose a set of hypothetical action steps in service of dataset use, development, and documentation.
Country:
- North America > United States > New York > New York County > New York City (0.04)
- South America (0.04)
- North America > Dominican Republic (0.04)
Technology:
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)